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Mammalian cells contain thousands of copies of mitochondrial DNA (mtDNA). At birth, these are thought to 
be identical in most humans. Here, we use long read length ultra-deep resequencing-by-synthesis to interro- 
gate regions of the mtDNA genome from related and unrelated individuals at unprecedented resolution. We 
show that very low-level heteroplasmic variance is present in all tested healthy individuals, and is likely to be 
due to both inherited and somatic single base substitutions. Using this approach, we demonstrate an in- 
crease in mtDNA mutations in the skeletal muscle of patients with a proofreading-deficient mtDNA polymer- 
ase 7 due to POLG mutations. In contrast, we show that OPA1 mutations, which indirectly affect mtDNA 
maintenance, do not increase point mutation load. The demonstration of universal mtDNA heteroplasmy 
has fundamental implications for our understanding of mtDNA inheritance and evolution. Ostensibly de 
novo somatic mtDNA mutations, seen in mtDNA maintenance disorders and neurodegenerative disease 
and aging, will partly be due to the clonal expansion of low-level inherited variants. 



INTRODUCTION 

Nucleated mammalian cells contain thousands of copies of the 
mitochondrial genome. Until recently, it has been generally 
accepted that, for the vast majority of humans, all mitochon- 
drial DNA (mtDNA) molecules are identical at birth (homo- 
plasmy) (1,2) However, recent work has shown that ~25% 
of healthy individuals inherit a mixture of wild-type and 
variant mtDNA (heteroplasmy), which almost exclusively 
involves the non-coding mtDNA D-loop (3). Even less fre- 
quently, < 1 in 200 inherit a potentially pathogenic variant 
of mtDNA in the coding region (4). Being exclusively mater- 
nally inherited, mtDNA undergoes negligible recombination 
manifested at the population level. As a result, homoplasmic 
variation of mtDNA has played a key role in determining 
population migrations on a global scale (5) and in the confi- 
dent identification of biological samples in forensic medicine 



(6). Heteroplasmic variants can confound the situation in 
both circumstances. 

MtDNA is highly mutable, with an estimated mutation rate 
of at least ~5 — 15 times that of the nuclear genome. This is 
partly a result of proximity to the electron transport chain 
that is the major intracellular source of oxidative free radicals, 
and partly a result of the relatively limited mtDNA protection 
and repair mechanisms (7). The high mutation rate contributes 
to the high levels of mtDNA diversity and also the generation 
of somatic mtDNA mutations with aging (8). Once present, the 
intracellular level of both inherited and somatic mtDNA muta- 
tions can change during life through the unequal partitioning 
of mitochondrial genotypes, which can occur during cytokinesis 
in dividing cells (vegetative segregation) or within a non- 
dividing cell through the relaxed replication of mtDNA. An 
increase in heteroplasmy is often referred to as 'clonal expan- 
sion' that may be enhanced by a replicative advantage for 
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specific mutation types (9). If deleterious mutations exceed a 
critical threshold level within a cell, they can cause a biochem- 
ical defect in the mitochondrial respiratory chain (10). 
This often involves cytochrome c oxidase (COX), leads to 
increased free radical production, a defect of ATP synthesis, 
cell dysfunction and ultimately cell death (1). High percentage 
levels of inherited mtDNA mutations cause multisystem mito- 
chondrial diseases in ~ 1 in 10 000 of the population (11), and 
high percentage levels of somatic mutations have been 
described in several age-associated degenerative diseases 
and possibly contribute to the aging process (12). Understand- 
ing the origin of mtDNA variants, therefore, has far-reaching 
implications, with impact on medicine, human anthropology 
and forensic science. 

Although generally assumed that age-related mtDNA muta- 
tions originate during life, it is equally plausible that some os- 
tensibly somatic variants are actually inherited, but fall below 
the detection threshold of previously applied technologies. 
With the recent development of massively parallel sequencing, 
it is now possible to definitively address this issue through 
the detection of very low-level heteroplasmic variants. Recent 
studies using whole mtDNA genome sequencing on the 
Illumina GA/Solexa platforms have suggested that a single het- 
eroplasmic variant can be detected in ~25% of individuals 
(3,13-15). However, the relatively conservative detection 
thresholds (a minor allele frequency of > 1.5-10%) used in 
these studies mean that lower frequency variants were not 
reported, despite the fact that, from first principles, most 
somatic mtDNA variants will be expected to fall below this 
level. 

We, therefore, sought to improve the depth of resolution for 
very low-frequency heteroplasmy by developing an amplicon- 
based resequencing method on the Roche 454 GS FLX 
platform. Despite the very great depth of coverage per base 
position achieved in previous studies using other platforms 
(3,13-15), the achievable lower limit of resolution for hetero- 
plasmic mtDNA variants is pragmatically limited by the 
'noise' generated principally during the sequencing reaction. 
Amplicon resequencing on the 454 GS FLX platform has, 
thus, become the method of choice for ultra-deep resequencing 
(UDS) of mutants and quasi-species within other small 
genomes, for example viruses (16) and bacteria (17), where 
resolution in the 0.1-1% variant frequency range has been 
described. 

RESULTS 

Developing a highly sensitive and specific method 
of detecting very low-level mtDNA sequence variants 

At the population level, MT-HV2 within the mtDNA non- 
coding control region is observed to be more polymorphic 
than the coding region (18). We, therefore, designed two 
mtDNA amplicons to compare the base substitution rate 
in two regions in different subjects: one in MT-HV2 
(nt 162-455, 294 bp) and one in the coding region MT-C03 
(nt 9307-9591, 285 bp) (see Supplementary Material, 
Table SI). As nuclear pseudogenes [nuclear mitochondrial 
sequences (NumtS)] have the potential to confound deep rese- 
quencing of mtDNA, we took a three-stage process to exclude 



the possibility of this affecting our data. Firstly, we compared 
our amplicon loci with published NumtS to determine the 
extent to which they nested within these regions and the cor- 
responding degree of sequence identity (19). Only one of the 
NumtS had a high sequence identity with modern mtDNA 
(>90%) and was predicted to lie within our MT-C03 ampli- 
con (Supplementary Material, Table S2). However, this 
NumtS was still predicted to differ by six nucleotide substitu- 
tions from modern mtDNA, which significantly exceeded the 
number of variants ever seen on single reads within our 
dataset. Secondly, we used Primer-BLAST analysis of each 
amplicon primer pair (hgl9 build), which showed no predicted 
non-mtDNA PCR product generation (20). Finally, we 
attempted to PCR amplify a product using genomic DNA 
extracted from rhoO cells that contained no mtDNA, following 
ethidium bromide-induced mtDNA depletion to directly 
confirm our predictions and to look for unexpected NumtS 
hits. No product was obtained with either amplicon primer 
set, despite a strong band with a positive control (nuclear 
gene primers). We, therefore, concluded that nuclear pseudo- 
gene amplification was highly unlikely with the specific 
primers that we designed to amplify MT-HV2 and MT-C03. 
In the 454 assay, the mean depth of coverage using these 
primers was 8391 reads (range 4158-20 803). 

Although commercially available lambda phage controls can 
be used to estimate the level of background sequencing arti- 
facts, differences in guanine-cytosine content of the cloned tem- 
plates can influence the outcome. We, therefore, designed 
several in-house controls to determine the level of background 
noise using the 454 platform on mtDNA templates: (i) a nuclear 
DNA amplicon in BRCA2 (NC_000013.10, 32907099- 
32907295, 197 bp) was designed to compare the intrinsic prop- 
erties of mtDNA and nuclear genomic DNA, (ii) a cloned 
nuclear DNA template corresponding exactly to the BRCA2 
amplicons was generated from genomic DNA and (iii) we 
cloned mtDNA templates from exactly the same mtDNA 
sequences (MT-HV2, MT-C03). On this basis, any difference 
in sequence variants between the cloned and genomic DNA 
templates from the same individual is highly likely to reflect 
'biological' sequence variants generated in vivo and not tech- 
nical artifact arising from the sequencing process. 

Being based on pyrosequencing, the base-calling errors on 
the 454 platform are very largely dependent on poly- 
mononucleotide tracts (21). As our goal was to maximize spe- 
cificity in calling very low-level variants, we, therefore, 
excluded these tracts from all of the analysis. We also intro- 
duced a further quality-control step, requiring observed var- 
iants to be present in both forward and reverse reads at 
comparable frequencies. A 3-fold difference was permitted 
to allow for the effects of a binomial sampling distribution 
at very low variant levels. 

Using this highly stringent approach, there were no variants 
present at >0.2% heteroplasmy in any of the cloned or 
genomic DNA templates (Fig. 1). This demonstrated the low 
background noise level using the 454 UDS approach and indi- 
cated that variants detected at > 0.2% heteroplasmy are highly 
likely to be generated in vivo and to be of biological origin. 

Further validation of this approach came from repeated 
experiments performed using the same 454 platform, but dif- 
ferent chemistry (Titanium) and different amplicon generation 



386 Human Molecular Genetics, 2013, Vol. 22, No. 2 





100% 




90% 




80% 




70% 










o 


60% 


□ 








H 
a 


50% 


o 




tton 


40% 


por 




0 


30% 


a. 






20% 




10% 




0% 



I MT-HV2 Clone 
i MT-C03 clone 
» Autosomal clone 
I Autosomal 



J. 



O O O © O O O Q O O O' o o o ©• o* 

variant heteroplasmy level 

Figure 1. Resolution of ultra-deep sequencing-by-synthesis assay. Demonstra- 
tion of very low levels of noise in negative controls after quality-control filter- 
ing for poly-mononucleotide tracts and bidirectional validation of variants. An 
amplicon was produced from cloned DNA for each mtDNA amplicon 
(MT-HV2, MT-C03) along with an autosomal amplicon (BRCA2) and a 
clone. All negative controls showed minimal numbers of base positions with 
any variants, and none at >0.2% heteroplasmy level, with no inherent differ- 
ences between the different DNA templates. 



primers. Duplicate experiments were performed on four 
samples of genomic DNA. There was 100% consistency of 
variant detection between runs, but 6% of the variants 
present in the first run fell below the 0.2% threshold in the 
second run. The heteroplasmy level range for these variants 
in the second run was 0.12-0.19%, demonstrating that these 
fell just outside our defined threshold for detection, but were 
indeed present. 

Frequency of very low-level mtDNA sequence variants 
in healthy control subjects 

Next, we studied genomic DNA extracted from whole blood 
and the skeletal muscle from unrelated and related healthy 
individuals at different ages. Demographic and genotypic 
details of the subjects are shown in Supplementary Material, 
Table S3. Detected mtDNA sequence variants are shown in 
Supplementary Material, Table S4. 

All of the healthy control subjects showed variants at 
>0.2% heteroplasmy in both the blood and skeletal muscle 
DNA samples at one or more base positions. These data indi- 
cate that very low-level single-base heteroplasmy appears to 
be a universal finding among different control subjects 
(Fig. 2A and B). The total number of variant base positions 
was the same in the blood and skeletal muscle samples, but 
variants at higher percentage heteroplasmy levels (>2%) 
were present only in the skeletal muscle of some individuals 
(three of seven healthy subjects) and never seen in the 
blood. Higher level variants (>2%) were only found in the 
MT-HV2 amplicon. 



Overall, the number of variants per base position was sig- 
nificantly more common in the skeletal muscle than in the 
blood (P = 0.001), and this difference was greater for 
MT-HV2 than MT-C03 (P < 0.001). In addition, the number 
of variants per base position was significantly greater in 
MT-HV2 than MT-C03 (P < 0.001). In summary, the greatest 
frequency of very low-frequency variants was seen in the 
MT-HV2 region, and both the greatest number of different var- 
iants and the highest heteroplasmy levels were seen in the 
skeletal muscle. 



The frequency of very low-level variants in disorders 
of mtDNA maintenance 

We studied patients with inherited defects of POLG encoding 
the sole mtDNA polymerase 7, predicted to have error-prone 
mtDNA replication ((22-24). For comparison, we also 
studied subjects with defects of OPAL Skeletal muscle biop- 
sies from these individuals show an excess of cells containing 
clonally amplified mtDNA somatic mutations; however, 
unlike for POLG defects, the mechanism is believed to be in- 
direct, mediated through defects in mitochondrial fission and 
fusion, rather than by any direct effect on the polymerase 
(25-27). Clinical details and pedigrees are shown in Supple- 
mentary Material, Table S3 and Supplementary Material, 
Figure SI. 

First, we studied eight patients with a variety of POLG 
mutations (A467T/A467T, A467T/G848S, A467T/W748S, 
R627Q/W748S and R627Q/R1096H). In skeletal muscle biop- 
sies, the overall burden of mtDNA variants was greater in 
patients with POLG mutations when compared with healthy 
control subjects, but only for MT-HV2, where the majority 
of variants were seen (P = 0.004) (Fig. 2A). The excess of 
mtDNA variants was present across all levels of heteroplasmy, 
and increased significantly with age (r = 0.86, P= 0.006), 
suggestive of a time-dependent accumulation of low-level 
somatic mutations. There was no significant correlation of mu- 
tation burden with age in healthy controls (r = 0.62, P = 
0.14). Given that the mutational burden was significantly 
greater at lower heteroplasmy levels in POLG when compared 
with control subjects (for <1% heteroplasmy level variants, 
P = 0.019), these findings imply that that the higher mutation- 
al burden is primarily due to de novo somatic mutation in 
POLG patients. 

We, then, studied multiple tissues from patients with muta- 
tions in OPA1 (MlfsX208, S545R, V294fsX667, R905X and 
intron 8 and intron 15 splicing defects). In contrast to the 
POLG patients, there was no significant increase in mutational 
burden in OPA1 patients when compared with healthy controls 
in either the blood or skeletal muscle for the MT-HV2 or 
MT-C03 amplicons (Fig. 2A and B). These findings argue 
against an increased rate of de novo mutagenesis in patients 
with OPA1 mutations, consistent with there being no direct 
effect on the polymerase 7 and in keeping with previous sug- 
gestions that accelerated clonal expansion of existing somatic 
mutations is responsible for the cellular cytochrome c oxidase 
defects (28). We studied the mutational burden in the skeletal 
muscle of one subject on two occasions 10 years apart. Six het- 
eroplasmic variants were detected in the first muscle sample of 
this individual. All six variants were also detectable at 
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Figure 2. Very low-level heteroplasmic mtDNA variance was detected by ultra-deep amplicon resequencing. (A) Comparison of variants detected in the skeletal 
muscle (skm) DNA within the MT-HV2 amplicon in different patient groups shows no difference between healthy control (n = 7) and OPA1 (n = 8) subjects, but 
a significant excess of variance at all heteroplasmy levels in POLG subjects (« = 8). (B) Comparison of variants detected in the blood DNA in healthy control 
(n = 7), POLG (n = 4) and OPAI (n = 7) subjects shows that variants are present, but less common than in the skeletal muscle DNA with the absence of higher 
level variants (>2% heteroplasmy) and no difference between patient groups. 



comparable levels in the follow-up muscle sample taken 10 
years later, although two variants in the second sample fell 
below the 0.2% threshold (0.17%, 0.06%). Overall, five variants 
showed a decrease in heteroplasmy level and one showed an in- 
crease over the 10 year period; however, the absolute changes 
were small (Supplementary Material, Table S5). 



Determining whether very low-level variants are 
maternally inherited 

Having identified low-level heteroplasmic mtDNA variants in 
the different pedigrees, we, then, compared related and unre- 
lated individuals to determine the extent to which the 
mtDNA variants arose through maternal transmission or 
somatic mutation. We focused on families with known 
nuclear genetic defects associated with mtDNA maintenance 
because of the larger number of variants available for scrutiny 
(i.e. increased signal strength). 

We compared 13 pairs of samples from 12 maternal first- 
degree relatives, including one subject with repeated blood 
and muscle samples taken 10 years apart. To determine the 
background level of similarity between unrelated individuals, 
we compared the same tissue between 79 different random 
pairs of unrelated subjects. Out of 485 variants, 59 variants 
were common in unrelated subjects (12%). In contrast, when 
the analysis was restricted to first-degree maternal relatives, 
25/63 variants were shared (40%, P< 0.001, Table 1). We 
also compared the frequency of sharing between 13 pairs of 
related individuals and 13 random pairs of unrelated controls. 
This also repeatedly revealed a highly significant difference 
(P < 0.001), adding further weight to our conclusions. On 
average, 39% of variants in a given individual were shared 
with their maternal relative. Interestingly, shared variants 
in related subjects were principally seen in the skeletal 
muscle mtDNA, where 17/24 variants were shared (71%), 
compared with 50/395 variants shared in unrelated subjects 



(13%, P < 0.001). In contrast, variant sharing in related sub- 
jects was rather uncommon in the blood mtDNA, where 8/ 
39 variants were shared (21%) when compared with 9/90 var- 
iants in unrelated subjects (10%, P = 0.10). 

These findings have two implications. First, some of the 
more common MT-HV2 variants are present in the background 
population at very low heteroplasmy levels. Second, 4-fold 
more variants are family specific and are transmitted 
between first-degree maternal relatives. Although it is not pos- 
sible, without longitudinal study, to state exactly what propor- 
tion of variants are due to somatic mutation, the data clearly 
show that a large proportion appear to be inherited at very 
low heteroplasmy levels. 



DISCUSSION 

Using amplicon-based UDS at unprecedented depth, we have 
shown that detectable very low-frequency mtDNA variants 
(0.2-2% heteroplasmy) are present in all tested healthy sub- 
jects. 

It is highly unlikely that nuclear pseudogene (NumtS) con- 
tamination influenced our results for the following reasons: 
Primer-BLAST analysis (hgl9 build) failed to identify any 
predicted non-mtDNA PCR products using our specific 
MT-HV2 and MT-C03 primers; we were unable to generate 
a product from rhoO genomic DNA using these primers. More- 
over, further evidence came from the post hoc analysis of in- 
dividual sequence reads and scrutiny of all predicted nuclear 
pseudogenes (Supplementary Material, Table S2) (19). 
Based on this analysis, a minimum of six changes from the 
mtDNA consensus sequence would be expected in a read 
from a pseudogene with our MT-C03 primers and ~48 
changes for the MT-HV2 primers. Read-by-read analysis 
never identified this number of variants, so pseudogene con- 
tamination is extremely unlikely in our dataset. We conclude, 
therefore, that heteroplasmy is a universal finding in humans. 



388 Human Molecular Genetics, 2013, Vol. 22, No. 2 

Table 1. Extent of sharing of variants in 13 pairs of maternally related and 79 pairs of unrelated samples 



Maternally related samples Unrelated samples P-value 

Shared variants Unshared variants Shared variants Shared variants Unshared variants Shared variants 
(#») (n) (%) («) («) (%) 



Skeletal 17 7 70.8 50 345 12.7 <0.0001 
muscle 

Blood 8 31 20.5 9 81 10.0 0.10 

All samples 25 38 39.7 59 426 12.2 <0.0001 



Although our approach does not allow us to state with absolute 
certainty whether any one variant is inherited or a somatic mu- 
tation, a comparison of maternal relatives shows substantial 
sharing between first-degree relatives, and our findings, there- 
fore, show that inherited variants make a significant contribu- 
tion to the overall mutation load in any one individual. 

None of the heteroplasmic variants detected were hap- 
logroup defining or haplogroup specific. Twelve of 40 base 
positions containing heteroplasmic variants are reported to 
be polymorphic (>1% of reported sequences with variants) 
at the population level (http://www.mtdb.igp.uu.se/). None of 
the variants detected have been ascribed definite pathogenic 
potential, although a single mutation at position m.9544 has 
previously been associated with optic neuropathy. We also 
examined whether variants detected had previously been 
reported as somatic mutations. In our data, 21 of 3 1 control 
region (MT-HV2) heteroplasmic variants were detected, and 
2 of 9 coding region (MT-C03) variants have previous been 
reported as somatic variants (http://www.mitomap.org/ 
MITOMAP) (Supplementary Material, Table S4). 

Using this approach, we also show that a disrupted mtDNA 
polymerase 7 due to a POLG defect leads to increased levels 
of mutations, and that these increase during life, mirroring 
observations in POLG-deficient mice (29,30). In contrast, we 
did not observe a correlation between age and mutation 
burden in control subjects, whether in the blood or muscle. 
This suggests that de novo mutagenesis throughout life is un- 
likely to contribute significantly to the cellular COX defects 
observed in healthy aged individuals. Rather, age-related 
COX defects are more likely to be the result of clonal expan- 
sions of mutations occurring during early life (31). This notion 
has recently been demonstrated in aged colonic crypt cells, 
where multimethod measurement of mtDNA mutation load 
failed to demonstrate an age-dependent increase (32). 
However, our observation of very low-frequency hetero- 
plasmy transmission indicates that that many of these muta- 
tions will be inherited down the maternal line. As a result, 
all clonally expanded species need not be the result of 
somatic mutation events in early life, and some could have 
arisen from low-level inherited variants. 

The changes in low-level heteroplasmy value between each 
mother and offspring (Supplementary Material, Figure SI and 
Supplementary Material, Table S4) are relatively small when 
compared with the shifts observed in patient pedigrees carry- 
ing higher levels of pathogenic mutations (33). However, 
this is consistent with the neutral drift theory (34) which pre- 
dicts that the variance in the offspring heteroplasmy due to the 
mtDNA bottleneck decreases as the mother's mutation level 
decreases. These are the first data to show the inheritance of 



such low-level mtDNA mutations. The high rate of inheritance 
of the low-level mutation in muscle (71%) may be surprising, 
considering our expectations of the effect of the mtDNA inher- 
itance bottleneck; however, two points should be considered 
here. First, that muscle data is from only two sibling pairs 
[subjects A-B and P-Q (Supplementary Material, Figure SI 
and Supplementary Material, Table S4)]. Second, as noted 
above, the neutral drift theory predicts that very low-level 
mutations will have low variance in the offspring, making 
them more likely to be preserved through the mtDNA inherit- 
ance bottleneck. 

The different patterns of mutation observed in the blood and 
skeletal muscle are, at least in part, likely to be a consequence 
of different rates of cell and mtDNA turnover between divid- 
ing (blood) and post mitotic tissues. Rapid turnover of blood 
cells can lead to the loss of mtDNA mutations during life, 
either through selection against a particular mutation (35) or 
simply by genetic drift (36). On the other hand, the loss of 
mutations is much less likely in a post mitotic tissue such as 
the skeletal muscle, where the replication of mtDNA can 
lead to an increase in mutation load during life within individ- 
ual cells and the tissue as a whole, even from very low levels 
of heteroplasmy (31). Thus, it is plausible that an inherited 
mutation is lost from the blood, but detected in the muscle, 
explaining why some inherited mutations are more likely to 
be detected in the muscle than in the blood. 

It is intriguing that the frequency of low-level variants in 
MT-HV2 is significantly greater than in MT-C03 in healthy 
control subjects (OR 3.3). Why should this be the case, 
given that our UDS of cloned mtDNA showed no intrinsic 
sequence-specific difference between the two templates? 
One possible explanation is that much of the non-coding 
D-loop heteroplasmy is actually inherited at a very low 
level. Being a non-coding region, these substitutions 
may be tolerated during transmission, unlike coding region 
variants that undergo strong negative selection during trans- 
mission (37). 

Perhaps most importantly, we show here that next gener- 
ation sequencing has the potential to reliably detect very 
low levels of heteroplasmy, when a very stringent analytical 
approach is employed. It is important to note that we did not 
perform a head-to-head comparison of different next gener- 
ation sequencing platforms, so it would be premature to con- 
clude that the 454 approach is superior to other platforms. 
However, using this method, we have gained novel insight 
into mtDNA within individuals and within pedigrees. Pro- 
spective studies in larger family-based cohorts will substanti- 
ate these findings. However, given that mtDNA 
heteroplasmy levels can change dramatically during life, and 
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during maternal transmission, the finding of universal mtDNA 
heteroplasmy has significant implications for our understand- 
ing of mtDNA at the population, family and individual 
level. If deleterious mutations are inherited, these have the po- 
tential to accumulate within single cells during life and thus 
contribute to neurodegenerative disease. Or, if they segregate 
rapidly through the mtDNA bottleneck, they could lead to a 
maternally inherited mitochondrial disorder. This places 
greater emphasis on the importance of developing techniques 
to prevent the transmission of mtDNA heteroplasmy and pre- 
venting the clonal expansion of pre-existing mtDNA muta- 
tions. 

MATERIALS AND METHODS 

Patient and samples 

All subjects gave informed consent for participation in re- 
search. Characteristics of patient and control samples are 
shown in Supplementary Material, Table S3 and Supplemen- 
tary Material, Figure SI. 



Molecular analyses 

Amplicon resequencing was performed on the Roche 454 GS 
FLX platform. Two mtDNA amplicons were used: MT-HV2 
(NC_012920, nt.162-455) and MT-C03 (nt.9307-9591). 
Confirmatory experiments on a subset of samples used the fol- 
lowing amplicon positions: MT-HV2 (nt. 109-483) and 
MT-C03 (nt. 9304-9653). Full primer sequences are shown 
in Supplementary Material, Table SI. Owing to the potential 
of nuclear pseudogenes to confound deep resequencing 
studies of mtDNA, we used BLAST to ensure that our 
primers avoided these areas and confirmed this by lack of 
amplification from mtDNA deplete rhoO DNA. Amplicon- 
specific mtDNA clones comprised the following inserts in a 
pGEM-T-easy vector (Promega): MT-HV2 clone (nt. 16548 - 
771) and MT-C03 clone (nt.9127-9661). Autosomal negative 
control comprised amplicon BRCA2 (NC_000013.10, 
32907099-32907295) and BRCA2 clone (32906828- 
32907480). Amplicon generation PCR reactions were per- 
formed in a 50 |xl volume comprising: lx buffer for KOD 
Hot Start DNA Polymerase (Novagen), 1.5 mM MgS0 4 , 
0.2 mM dNTPs, 0.3 |xm primers, 1 U KOD Hot Start DNA 
Polymerase (Novagen) and 100 ng DNA. Cycling conditions 
were 95°C for 2 min followed by 30 cycles of 95°C for 20 s, 
60°C for 10 s and 70°C for 4 s. EmPCR and bidirectional se- 
quencing were performed according to the manufacturer's 
instructions (Roche 454). 



Bioinformatic analyses 

Base calling and alignment were performed using the algo- 
rithm of PyroBayes and Mosaik. Subsequent analysis of var- 
iants was performed using the custom R library fiowgram. 
To maximize specificity for very low-level variants, base posi- 
tions within or immediately adjacent to poly-mononucleotide 
tracts were excluded from analysis. Variant calls were then 
validated by comparison of read directions. 



Primer-BLAST (20) analysis to confirm complete specifi- 
city of amplicon generation primer pairs was performed 
using standard stringent parameters: human genome assembly 
hgl9, blast E value 30 000and unintended targets with <6 
mismatches were considered. 



Statistical analyses 

Comparison of proportions of base positions between groups 
was done using the chi-squared test. 

SUPPLEMENTARY MATERIAL 

Supplementary Material is available at HMG online. 
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